86 research outputs found

    Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US

    The United States spends more than $1B each year on initiatives such as the American Community Survey (ACS), a labor-intensive door-to-door study that measures statistics relating to race, gender, education, occupation, unemployment, and other demographic factors. Although a comprehensive source of data, the lag between demographic changes and their appearance in the ACS can exceed half a decade. As digital imagery becomes ubiquitous and machine vision techniques improve, automated data analysis may provide a cheaper and faster alternative. Here, we present a method that determines socioeconomic trends from 50 million images of street scenes, gathered in 200 American cities by Google Street View cars. Using deep learning-based computer vision techniques, we determined the make, model, and year of all motor vehicles encountered in particular neighborhoods. Data from this census of motor vehicles, which enumerated 22M automobiles in total (8% of all automobiles in the US), was used to accurately estimate income, race, education, and voting patterns, with single-precinct resolution. (The average US precinct contains approximately 1000 people.) The resulting associations are surprisingly simple and powerful. For instance, if the number of sedans encountered during a 15-minute drive through a city is higher than the number of pickup trucks, the city is likely to vote for a Democrat during the next Presidential election (88% chance); otherwise, it is likely to vote Republican (82%). Our results suggest that automated systems for monitoring demographic trends may effectively complement labor-intensive approaches, with the potential to detect trends with fine spatial resolution, in close to real time.
    Comment: 41 pages including supplementary material. Under review at PNA

    Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom

    Hi-C experiments study how genomes fold in 3D, generating contact maps containing features as small as 20 bp and as large as 200 Mb. Here we introduce Juicebox, a tool for exploring Hi-C and other contact map data. Juicebox allows users to zoom in and out of Hi-C maps interactively, just as a user of Google Earth might zoom in and out of a geographic map. Maps can be compared to one another, or to 1D tracks or 2D feature sets.
    Funding: National Institutes of Health (U.S.) (NIH New Innovator Award (1DP2OD008540-01)); National Human Genome Research Institute (U.S.) ((NHGRI) Centers of Excellence in Genomic Science (P50HG006193)); NVIDIA Corporation; International Business Machines Corporation (IBM University Challenge Award); Google (Firm) (Google Research Award); Baylor College of Medicine (McNair Medical Institute Scholar Award); Cancer Prevention and Research Institute of Texas (Scholar Award (R1304)); Presidential Early Career Award for Scientists and Engineers; National Science Foundation (U.S.) (NSF Physics Frontiers Centers (Center for Theoretical Biological Physics)); Robert A. Welch Foundation; National Institute of General Medical Sciences (U.S.) (NIGMS R01GM074024); National Human Genome Research Institute (U.S.) (NHGRI (HG003067)

    Hi-C: A Method to Study the Three-dimensional Architecture of Genomes.

    The three-dimensional folding of chromosomes compartmentalizes the genome and can bring distant functional elements, such as promoters and enhancers, into close spatial proximity [2-6]. Deciphering the relationship between chromosome organization and genome activity will aid in understanding genomic processes, like transcription and replication. However, little is known about how chromosomes fold. Microscopy is unable to distinguish large numbers of loci simultaneously or at high resolution. To date, the detection of chromosomal interactions using chromosome conformation capture (3C) and its subsequent adaptations required the choice of a set of target loci, making genome-wide studies impossible [7-10].

    Static and Dynamic DNA Loops form AP-1-Bound Activation Hubs during Macrophage Development

    The three-dimensional arrangement of the human genome comprises a complex network of structural and regulatory chromatin loops important for coordinating changes in transcription during human development. To better understand the mechanisms underlying context-specific 3D chromatin structure and transcription during cellular differentiation, we generated comprehensive in situ Hi-C maps of DNA loops during human monocyte-to-macrophage differentiation. We demonstrate that dynamic looping events are regulatory rather than structural in nature and uncover widespread coordination of dynamic enhancer activity at preformed and acquired DNA loops. Enhancer-bound loop formation and enhancer-activation of preformed loops represent two distinct modes of regulation that together form multi-loop activation hubs at key macrophage genes. Activation hubs connect 3.4 enhancers per promoter and exhibit a strong enrichment for Activator Protein 1 (AP-1) binding events, suggesting multi-loop activation hubs driven by cell-type specific transcription factors may represent an important class of regulatory chromatin structures for the spatiotemporal control of transcription.

    A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping

    Summary: We use in situ Hi-C to probe the 3D architecture of genomes, constructing haploid and diploid maps of nine cell types. The densest, in human lymphoblastoid cells, contains 4.9 billion contacts, achieving 1 kb resolution. We find that genomes are partitioned into contact domains (median length, 185 kb), which are associated with distinct patterns of histone marks and segregate into six subcompartments. We identify ∼10,000 loops. These loops frequently link promoters and enhancers, correlate with gene activation, and show conservation across cell types and species. Loop anchors typically occur at domain boundaries and bind CTCF. CTCF sites at loop anchors occur predominantly (>90%) in a convergent orientation, with the asymmetric motifs “facing” one another. The inactive X chromosome splits into two massive domains and contains large loops anchored at CTCF-binding repeats.

    Deletion of DXZ4 on the human inactive X chromosome alters higher-order genome architecture

    During interphase, the inactive X chromosome (Xi) is largely transcriptionally silent and adopts an unusual 3D configuration known as the "Barr body." Despite the importance of X chromosome inactivation, little is known about this 3D conformation. We recently showed that in humans the Xi chromosome exhibits three structural features, two of which are not shared by other chromosomes. First, like the chromosomes of many species, Xi forms compartments. Second, Xi is partitioned into two huge intervals, called "superdomains," such that pairs of loci in the same superdomain tend to colocalize. The boundary between the superdomains lies near DXZ4, a macrosatellite repeat whose Xi allele extensively binds the protein CCCTC-binding factor. Third, Xi exhibits extremely large loops, up to 77 megabases long, called "superloops." DXZ4 lies at the anchor of several superloops. Here, we combine 3D mapping, microscopy, and genome editing to study the structure of Xi, focusing on the role of DXZ4. We show that superloops and superdomains are conserved across eutherian mammals. By analyzing ligation events involving three or more loci, we demonstrate that DXZ4 and other superloop anchors tend to colocate simultaneously. Finally, we show that deleting DXZ4 on Xi leads to the disappearance of superdomains and superloops, changes in compartmentalization patterns, and changes in the distribution of chromatin marks. Thus, DXZ4 is essential for proper Xi packaging.
    Funding: National Human Genome Research Institute (U.S.) (Grant HG003067

    Cohesin depleted cells pass through mitosis and reconstitute a functional nuclear architecture

    The human genome forms thousands of “contact domains”, which are intervals of enhanced contact frequency. Some, called “loop domains”, are thought to form by cohesin-mediated loop extrusion. Others, called “compartmental domains”, form due to the segregation of active and inactive chromatin into A and B compartments. Recently, Hi-C studies revealed that the depletion of cohesin leads to the disappearance of all loop domains within a few hours, but strengthens compartment structure. Here, we combine live cell microscopy, super-resolution microscopy, Hi-C, and studies of replication timing to examine the longer-term consequences of cohesin degradation in HCT-116 human colorectal carcinoma cells, tracking cells for up to 30 hours. Surprisingly, cohesin depleted cells proceed through an aberrant mitosis, yielding a single postmitotic cell with a multilobulated nucleus. Hi-C reveals the continued disappearance of loop domains, whereas A and B compartments are maintained. In line with Hi-C, microscopic observations demonstrate the reconstitution of chromosome territories and chromatin domains. An interchromatin channel system (IC) expands between chromatin domain clusters and carries splicing speckles. The IC is lined by active chromatin enriched for RNA Pol II and depleted in H3K27me3. Moreover, the cells exhibit typical early-, mid-, and late- DNA replication timing patterns. Our observations indicate that the functional nuclear compartmentalization can be maintained in cohesin depleted pre- and postmitotic cells.
However, we find that replication foci – sites of active DNA synthesis – become physically larger, consistent with a model where cohesin dependent loop extrusion tends to compact intervals of replicating chromatin, whereas their genomic boundaries are associated with compartmentalization, and do not change.
Abbreviations: 3D FISH, 3D fluorescence in situ hybridization; 3D SIM, 3D structured illumination microscopy; AID, auxin inducible degron; ANC / INC, active / inactive nuclear compartment; CT, chromosome territory; CD(C), chromatin domain (cluster); CTCF, CCCTC binding factor; DAPI, 4’,6-diamidino-2-phenylindole; EdU, 5-Ethynyl-2’-deoxyuridine; Hi-C, chromosome conformation capturing combined with deep sequencing; IC, interchromatin compartment; MLN, multilobulated nucleus; NC, nucleosome cluster; PBS, phosphate buffered saline; PBST, phosphate buffered saline with 0.02% Tween; PR, perichromatin region; RD, replication domain; RL, replication labeling; TAD, topologically associating domai

    Cohesin depleted cells rebuild functional nuclear compartments after endomitosis

    Cohesin plays an essential role in chromatin loop extrusion, but its impact on a compartmentalized nuclear architecture, linked to nuclear functions, is less well understood. Using live-cell and super-resolved 3D microscopy, here we find that cohesin depletion in a human colon cancer derived cell line results in endomitosis and a single multilobulated nucleus with chromosome territories pervaded by interchromatin channels. Chromosome territories contain chromatin domain clusters with a zonal organization of repressed chromatin domains in the interior and transcriptionally competent domains located at the periphery. These clusters form microscopically defined, active and inactive compartments, which likely correspond to the A/B compartments detected with ensemble Hi-C. Splicing speckles are observed nearby within the lining channel system. We further observe that the multilobulated nuclei, despite continuous absence of cohesin, pass through S-phase with typical spatio-temporal patterns of replication domains. Evidence for structural changes of these domains compared to controls suggests that cohesin is required for their full integrity.

    The Australasian dingo archetype: de novo chromosome-length genome assembly, DNA methylome, and cranial morphology

    BACKGROUND: One difficulty in testing the hypothesis that the Australasian dingo is a functional intermediate between wild wolves and domesticated breed dogs is that there is no reference specimen. Here we link a high-quality de novo long-read chromosomal assembly with epigenetic footprints and morphology to describe the Alpine dingo female named Cooinda. It was critical to establish an Alpine dingo reference because this ecotype occurs throughout coastal eastern Australia where the first drawings and descriptions were completed. FINDINGS: We generated a high-quality chromosome-level reference genome assembly (Canfam_ADS) using a combination of Pacific Bioscience, Oxford Nanopore, 10X Genomics, Bionano, and Hi-C technologies. Compared to the previously published Desert dingo assembly, there are large structural rearrangements on chromosomes 11, 16, 25, and 26. Phylogenetic analyses of chromosomal data from Cooinda the Alpine dingo and 9 previously published de novo canine assemblies show dingoes are monophyletic and basal to domestic dogs. Network analyses show that the mitochondrial DNA genome clusters within the southeastern lineage, as expected for an Alpine dingo. Comparison of regulatory regions identified 2 differentially methylated regions within glucagon receptor GCGR and histone deacetylase HDAC4 genes that are unmethylated in the Alpine dingo genome but hypermethylated in the Desert dingo. Morphologic data, comprising geometric morphometric assessment of cranial morphology, place dingo Cooinda within population-level variation for Alpine dingoes. Magnetic resonance imaging of brain tissue shows she had a larger cranial capacity than a similar-sized domestic dog. CONCLUSIONS: These combined data support the hypothesis that the dingo Cooinda fits the spectrum of genetic and morphologic characteristics typical of the Alpine ecotype. 
We propose that she be considered the archetype specimen for future research investigating the evolutionary history, morphology, physiology, and ecology of dingoes. The female has been taxidermically prepared and is now at the Australian Museum, Sydney.